Accelerated Molecular Dynamics through stochastic iterations to strengthen yield of 

path hopping over upper states (SISYPHUS) 



Pratyush Tiwary and Axel van de Walle 
School of Engineering, Brown University, Providence, Rhode Island 02912, USA 

(Dated: January 3, 2013) 

We present a new method, caUed SISYPHUS (Stochastic Iterations to Strengthen Yield of Path 
Hopping over Upper States), for extending accessible time-scales in atomistic simulations. The 
method proceeds by separating phase space into basins, and transition regions between the basins 
based on a general collective variable (CV) criterion. The transition regions are treated via tra- 
ditional molecular dynamics (MD) while Monte Carlo (MC) methods are used to (i) estimate the 
expected time spent in each basin and (ii) thermalize the system between two MD episodes. In par- 
ticular, an efficient adiabatic switching based scheme is used to estimate the time spent inside the 
basins. The method offers various advantages over existing approaches in terms of (i) providing an 
accurate real time scale, (ii) avoiding reliance on harmonic transition state theory and (iii) avoiding 
the need to enumerate all possible transition events. Applications of SISYPHUS to low temper- 
ature vacancy diffusion in BCC Ta and adatom island ripening in FCC Al are presented. A new 
CV appropriate for such condensed phases, especially for transitions involving collective motions of 
several atoms, is also introduced. 



Achieving usefully long timescales (seconds or longer) 
in atomistic simulations of materials is a problem of great 
interest, and the search for a practical and general solu- 
tion has generated intense activity in the field over last 
several decades^ ^ (see^^ for excellent reviews of these 
and other efforts). The problem arises because, as the 
system moves from one energy basin to another through 
infrequent rare events, it stays trapped in some energy 
basin for extended periods of time. Along with the small 
time steps (on the order of femtoseconds) needed for the 
total energy staying conserved, this severely restricts the 
time scales accessible in MD simulations and also leads 
to limited phase-space exploration. 

Although many methods exist to increase the rate of 
rare events and efficiently explore the rugged free energy 
landscape, a frequent limitation is the inability to effi- 
ciently obtain an accurate estimate of the "real" time 
scale of the simulation in general physical systems. Ex- 
isting schemes to achieve this either 

1. require cataloguing all possible transitions paths 
out of given basiiP^^, which can be computation- 
ally prohibitive in low-symmetry systems, or 

2. rely on harmonic approximations to the system's 
energy surface^^, or 

3. involve computing averages that converge slowly, 
especially for large system sizes, because they in- 
volve exponentials of the system's total energy.'^^ 

In this letter, we propose a new mixed Monte Carlo- 
Molecular Dynamics method that uses a collective vari- 
able (CV) X solely to discriminate between basins and 
transition regions, thus placing very weak requirements 
on the choice of CV, less stringent than what required in 
metadynamics methods^^. The method still provides the 
system's dynamics via conventional atomic coordinates 
and thus provides more detailed information than the 



coarse grained dynamics typically provided by Metady- 
namics methods^ The method also provides an accu- 
rate real time scale that does not deteriorate with system 
size and that does not rely on harmonicity assumptions. 
There is no need to construct a priori or in situ a cat- 
alog of possible transition mechanisms. The method is 
specially suited for exploiting massive parallel comput- 
ing. 

The proposed algorithm generalizes our previous worlP 
along multiple dimensions. First, we use a general CV 
X (instead of the system's potential energy) to discrimi- 
nate between basins and transition regions and propose 
a novel type of CV suitable for this purpose in con- 
densed phases. (Recently proposed dimensionality re- 
duction algorithms'^ that discover CV automatically 
could be used as well.) Second, we introduce an adi- 
abatic switching scheme to efficiently calculate the real 
time spent inside wells. Finally, we use a more robust cri- 
terion to determine when the system has been trapped 
in a basin for sufficiently long time to have equilibrated 
therein. 

Let the system be characterized by position x = 
(ri,...,rAr) and velocity v = {yi^ . . . ^vn)- The CV x 
is a function of x (we give an example of such a function 
later in this letter). For a user-specified cut-off value Xcuti 
we define the basins, or wells, W in this x- space as a set 
of connected states for which x < Xcut- In the wells, the 
method does not follow the system's exact trajectory in 
phase space, but instead provides the expected amount 
of time spent in each well. In contrast, the x ^ Xcut 
region of the phase space contains the interesting but in- 
frequently occurring events whose dynamics is fully de- 
scribed. The choice of Xcut thus specifies the level of de- 
tails one wishes to retain in the simulation. Large values 
of Xcut may cause wells to merge and limit the ability to 
resolve the precise dynamics of some events. The method 
is still formally correct but the definition of the wells W 
may not have an an obvious physical meaning. Ne vert he- 
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less, the method is very robust to changes in Xcut that 
do not change the topology of the well structure (we will 
provide numerical evidence of this fact). 

When the CV x of the system is above the thresh- 
old Xcut 5 the system evolves via MD according to its 
true Hamiltonian with constant-pressure (or volume) or 
constant-temperature (or energy) ensemble as entailed 
by the simulation. When x < Xcut^ the algorithm con- 
tinues performing MD until either (i) x ^ Xcut (in which 
case the system is considered to have exited the well 
and standard MD continues as described above) or (ii) 
a time equal to the system's decorrelation time Tc has 
elapsed. In that latter case, the system is considered to 
have been trapped in the well for long enough that it has 
reached a local thermodynamic equilibrium. This crite- 
rion is similar to the one used in other methods^ to 
define a transition event. At this time, we launch a MC 
simulation (called MCa) whose aim is to generate a new 
random starting position at the well's boundary to ini- 
tialize the next MD episode. MCa is run for long enough 
that the system loses memory of how it entered the well 
and visits the boundary of the well a few times. MD is 
restarted with the position x where the system last vis- 
ited the well's boundary and with velocities v drawn from 
a truncated Maxwell-Boltzmann distribution conditional 
on V ■ Vx(^) > (i.e. we only consider velocities in the 
half-space pointing outwards of the well). 

In parallel to the first MC (MCa) we perform a sec- 
ond MC run (called MCb) that calculates the expected 
time tw the system would have spent inside the well. 
This separation of two MC runs makes our algorithm ex- 
tremely parallelizable and especially amenable to be used 
on loosely coupled cluster of computers. We can launch 
as many MCb runs as we have processors available. These 
runs do not need to communicate with each other, and 
because of the system's ergodicity, we can make a quick 
estimate of the quantity by averaging over these in- 
dependent runs. This parallelization is even simpler than 
for the Parallel Replica methocP^. 

Before we describe how we calculate the time the sys- 
tem would have spent in whichever well it visited, we 
describe specifically how MCa is implemented. We de- 
fine the boundary of the well W with thickness w: 

Sw = {x = (ri, ...,rAr) : \x{x) - Xcut\ < w} (1) 

Trying to visit Sw with an unbiased potential would be 
of no avail, since we will rarely visit states in We 
thus do MC with a biased potential V*{x) defined as 
follows: 
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V*(x) = V{x) 




X{x) > Xcut 
X{x) < Xcut 



(2) 



PerEq. ([2|, MCa never visits the outside of the well W 
and biases states with a penalty function that increases 
with their depth inside the well. Note that the bias is 
zero at the well boundary, which is important to obtain 
the correct sampling distribution of the boundarj/^. The 
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FIG. 1: Diffusion constant for vacancy diffusion in Ta at var- 
ious temperatures as through brute-force MD (circles) and 
SISYPHUS (stars). Errorbars (roughly same as marker size) 
are also provided as obtained over 16 independent runs. 



bias inside the well changes the fraction of time spent at 
the boundary but not the ratio of the times spent at any 
two points of the boundary. The parameter m determines 
how sharp the boundary of the well is (a value around 0.5 
is found to perform well in practice). Vq is kept around 
the standard deviation in potential energy of the system. 
As we demonstrate numerically later, the algorithm is 
very robust with respect to choice of parameters in Eq. 

Having described a way to accelerate the exploration of 
various wells, we now turn to the question of calculating 
(via a Monte Carlo labelled MCb) the expected time the 
system would have spent inside well W if there was no 
acceleration of the dynamics. This time, denoted tw, 
can be calculated as the reciprocal of the flux of states 
exiting the well: 



tw = lim((- l{x e S^)))- 

w^O W 



(3) 



where the average (• • • ) is taken over x drawn from 
the well W with a probability density proportional to 
^-yix)/{kBT) ^ /c^ is Boltzmann's constant, T is the tem- 
perature and 1(A) equals 1 if the event A is true and 
otherwise, v denotes the mean projection of a Maxwell- 
Boltzmann-distributed velocity along the unit vector u 
parallel to Vx(^), conditional on v.\/x{x) < 0. When 
all atoms have the same mass v = ^JkuT /27im (a 
general expression can be found in Ref. [3]). 

A straightforward implementation of Eq. Q will again 
suffer from a rare event problem. Even an importance 
sampling scheme, as suggested in Ref. |3] is not very ef- 
ficient. Consider what happens if we used the biased 
potential as defined in Eq. ([2| to calculate tw- 



tw = lim 



^^0 (^e-/5(^(^)-^*(^))l(x G S^))* 



(4) 



where (•••)* denote expectations taken under a density 
proportional to e~^^ "^^^ in which ^5 is 1/(A:bT). This ap- 
proach is exact in the limit of an ensemble average, but 
there is a fundamental trade-off that limits its usefulness: 
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FIG. 2: (Top) Insensitivity of dynamics to choice of Xcut (relative to average x that temperature) for vacancy diffusion in Ta 
across temperatures, o, o,^ and * denote 500,600,700 and 800K temperatures respectively. (Centre) Corresponding speed-ups 
relative to physical time achieved in brute-force MD run in the same wall-clock time. (Bottom) Insensitivity of dynamics to 
choice of Vq. Errorbars over 16 independent runs for each data point. 



A large bias V*{x) — V{x) leads to a more rapid conver- 
gence of the denominator (due to an increased sampling 
rate of the boundary) but a slower convergence of the 
numerator (due to an increase in V*(x) — V{x)). 

To avoid this problem, we now propose a technique 
that be ars s ome resemblance to adiabatic switching 
methodJ^I^'^, in which the system is continuously, adi- 
abatically switched from V (x) (the true potential) to 
V*{x) (identical to the potential used in MCa). Let 
V (x, a) smoothly interpolate between V (x, 0) = V (x) 
and V (x, 1) = V*{x). Then we can express the ensemble 
average in Eq. (l3| as below (working in terms of rate ) : 



lim 



= lim 



Je-l3y(-fi)dx 



R (5) 



where dx denotes a differential volume in 3-N dimensional 
configuration space for N particles, the integration being 
performed over entire configuration space within the well 
W and the expected value {■ ■ in Eq.([5| is defined by 



(•••)a = 



|(...)e-m^.«)dx 



(6) 



Below we define the term R in Eq.([5| and re-express it 
in a computationally tractable form (see Supplemental 
Materials (SM) for a more detailed derivation): 



R 



exp 




We pick a linear switching scheme for V{x^a)^ i.e. an 
interpolation scheme between V{x^O) and V{x^ 1): 



V{x,a) = (1 -a)V(x) +aV*(x) 



(8) 



We now make a few observations regarding 
Eq.([5|. It involves 2 parts. The first is 

lini„^o(^^i^^^e-''(^(^>°)-^(^>i))\ . This is nonzero 



only when x e S^, and whenever it is nonzero, the 
difference V (x, 0) — V" (x, 1) is very small(see Eqs.([l|2 )). 
Since this average is calculated with the maximally 
biased potential V" (x, 1), the boundary x G visited 
frequently, and thus the first term in Eq.(|5| can be 
evaluated very quickly. The second part in Eq.([5| is R, 

where the average {^^^-^^)a = i^) — V i^)) a does 
not contain any exponentials that could cause a slow 
convergence. In SM, we prove rigorously that for this 
switching scheme and for the choice of biasing potential 
m Eq.(|2| we can use a non-uniform grid to evaluate R 
which can be made finer as a ^ but kept coarse for 
larger a, leading to further computational efficiency. 

For solid-state systems where bond-breaking is the 
dominant mechanism of interest, we take x be the 
bond distortion function (BDF), defined below for a N- 
particle system: 



X{x) ao 




i/p 



(9) 
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where p > 1 and ao is the Kelvin lattice parameter. For 
each bond, r^J is the equilibrium bond length that can be 
obtained by a few conjugate gradient steps each time a 
transition is detected. In the limit that p ^ oo, the BDF, 
which is a p-norm over fractional bond distortions, ap- 
proaches the maximum norm, i.e. the strain (times ao) in 
the maximally strained bond. For p-^ oo we thus recover 
the so-called bond-boost function^. We pick p around 
8-12 for the systems studied in this letter. Not taking 
p = oo allows us to treat on a similar footing cases where 
(a) one bond is distorted by a large amount, or (b) several 
bonds are collectively distorted by a significant amount 
which is however less than the amount in (a). As soon 
as either (a) or (b) happens, the BDF detects it through 
a spike and thus we can then switch back to doing com- 
pletely unbiased MD. This way we can treat transition 
mechanisms involving small but concerted and collective 
motion of several atoms (see Fig. |3]g-h)) - for example 
in glasses, shear transformation zones^^ involve several 
atoms moving displacing together by a small amount. 

We now describe applications of the algorithm that 
validate it and demonstrate its insensitivity to choice of 
parameters. The first example is vacancy assisted lattice 
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FIG. 3: Mechanisms seen by SISYPHUS for adatom movement on Al(OOl) surface. Atoms colored per z-coordinate. 
Blue/orange = substrate atom, green/red=adatom, red/orange = atom with maximum movement. Sohd hues are periodic 
boundaries. 



diffusion at low temperatures in BCC Tantalum. Lat- 
tice diffusion at low temperatures is a problem impor- 
tant in a spectrum of sciences from Materials Science to 
Geolog}'^^ but is beyond the time scales one can ac- 
cess in current MD simulations, requiring times longer 
than milliseconds^*^. We describe the parameters for 
the MD parlP^ of this simulation in SM. In Fig{l] we 
demonstrate how the diffusion constant through brute- 
force MD and SISYPHUS for vacancy assisted diffusion 
lie on the same Arrhenius plot, giving an Arrhenius-type 
activation energy of 0.9(+/-0.1) eV (in rough agreement 
with Harmonic Transition State Theory(HTST) calcula- 
tion of 1.1 eV). Figj2|top) demonstrates the lack of sen- 
sitivity of the dynamics to what Xcut and Vq values we 
pick. As expected the speed-up is higher for higher Xcut 
(Figj2|middle)). At higher temperature we find slightly 
higher sensitivity to Xcut (still within order of magnitude 
accuracy). This is because for a low Xcut^ the values 
becomes closer to time Tc the system takes to equilibrate 
in a well. Insensitivity to Vq values can be seen from 
Figj2|bottom) . A smaller Vq leads to slower sampling in 
Eqj4j and MCa also takes longer to converge. With too 
high a Vq however, the periphery of the well W can be 
too steep and one might again face sampling issues since 
the system can be trapped in some regions of the well 
boundary. In SM, we provide a back-of-t he-envelope es- 
timate of how to pick an optimum Vq for a given system. 

For our second application, we studied the room- 
temperature dynamics of Al adatoms on Al (001) thin 
film (625 atoms, roughly 3 times larger than the Ta va- 
cancy diffusion example). We picked this problem be- 
cause firstly, from a technological perspective, it is of im- 
mense importance for fabrication processes in nanoscale 
devices involving growth of thin films from deposited 
adatoms^^^^. This problem is very interesting from a 
theoretical perspective too, given that it is an inherently 
non-equilibrium phenomenon dictated by the interplay 
between kinetics and thermodynamics. Being able to 
model and control the growth and properties of such films 
is hugely desirable - the time-scales needed are however 
far beyo nd M D. Accurate Kelvin saddle point search 
methodd^^ have shown the existence of a large number 
of transition mechanisms with low and similar activation 
energies (smaller than 0.4eV). Such low lying barriers can 
be hard to deal with in most accelerated MD methods^^. 
On-the-fly KMC^ calculations have been used previously 




(a)t=0 ns (0 eV) (b)t=l /is (-1.6 eV) (c)t=2 ms (-1.8 eV) 
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FIG. 4: Room-temperature adatom ripening on Al (001) 
surface as a function of time. Each structure was quenched 
to find its corresponding local minima, and the energies af- 
ter quenching (relative to (a)) are provided, (a) shows the 
starting geometry. After 1/is, we have two disconnected clus- 
ters (b-c). Corresponding brute- force MD runs were found 
trapped in similar configuration in the fraction of microsec- 
ond time they could achieve. At around 4ms these two clus- 
ters join (d). At 9ms (e) there is further joining and we have 
effectively one long chain of adatoms. At 15ms (f) the chain 
has coarsened into one entity across simulation cells. Color 
scheme same as in Fig. 3. 



used to get rough estimate of the time-scale for adatom 
island ripening that we can compare SISYPHUS with. In 
SM we provide details of the simulation parameters for 
this system. 

Fig. [3] and the movies in SM illustrate the most com- 
mon mechanisms seen through SISYPHUS. We have the 
single adatom hop (a-b), the concerted two adatom hop 
(c-d), and the concerted event involving an adatom and 
a substrate atom as the latter moves to the adatom 
layer (e-f). Occasionally we see more complicated mecha- 
nisms like the 3-atom hop (g-h), and events with creation 
of a vacancy in the top substrate layer (see SM). The 
first three mechanisms are the most common and are in 
fact the lowest energy transitions found using the dimer 
method for saddle point search. In Fig. |4]and movie in 
SM we show the typical evolution of the island ripening 
at room temperature over several milliseconds of physi- 
cal time (exact time can vary from run to run well but 
still within an order of magnitude). Shown alongwith are 
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corresponding energies obtained by quenching each struc- 
ture to its local minima, illustrating further the effective- 
ness of the algorithm in escaping and exploring various 
local energy minima. The overall time can be compared 
with the analogous on-the-fly KMC work^ for same sys- 
tem and interatomic potential^^ where 20 adatoms (half 
as many as current work) formed one compact cluster 
around 1ms. As a verification that our proposed BDF is 
effective to decomposing phase space into disconnected 
wells, we show, in a movie accompanying the SM, a su- 
perposition of snapshots of the system during a MCa run, 
illustrating that the system does not jump from one well 
to another within one MCa run (since it rejects all moves 
with X > Xcut)- 

In conclusion, we have shown SISYPHUS to be an ex- 
tremely parallelizable and robust set of algorithms that 
help achieve fraction of second timescale for thousands 



of atoms. The method works well irrespective of system 
size, and can be applied in the general setting of any 
collective variable. We also introduced a new CV appro- 
priate for solid state systems especially for transitions 
involving collective motions of several atoms. 
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